Automatic Document Summarization by Sentence Extraction
Abstract
We present a method for automatic document summarization that generates a summary by clustering the sentences of the source document and extracting representative sentences from it. The advantage of the proposed approach is that the generated summary can cover the main content of practically all topics presented in the document. To determine the optimal number of clusters, a criterion for evaluating clustering quality is introduced.
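The cluster-then-extract idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the sentence representation, the clustering algorithm, or the quality criterion, so everything here is an assumption made for illustration: sentences are represented as unit-length bag-of-words vectors, clustered with a simple spherical k-means, and the number of clusters is chosen by the mean silhouette score, which stands in for the paper's own criterion. The function names (`unit_vectors`, `kmeans`, `silhouette`, `summarize`) are hypothetical.

```python
import math
import re
from collections import Counter


def unit_vectors(sentences):
    """Bag-of-words term-frequency vectors, scaled to unit length."""
    vectors = []
    for sentence in sentences:
        counts = Counter(re.findall(r"[a-z]+", sentence.lower()))
        norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
        vectors.append({term: c / norm for term, c in counts.items()})
    return vectors


def dot(u, v):
    """Sparse dot product; equals cosine similarity for unit vectors."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def centroid(members):
    """Unit-length mean direction of a group of sparse vectors."""
    total = Counter()
    for vec in members:
        for term, w in vec.items():
            total[term] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {term: w / norm for term, w in total.items()}


def kmeans(vectors, k, iterations=20):
    """Spherical k-means with deterministic farthest-first seeding."""
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(vectors)) if i not in seeds]
        # Next seed: the sentence least similar to every seed chosen so far.
        seeds.append(min(rest, key=lambda i: max(dot(vectors[i], vectors[s]) for s in seeds)))
    centroids = [vectors[s] for s in seeds]
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda c: dot(v, centroids[c])) for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = centroid(members)
    return labels, centroids


def silhouette(vectors, labels):
    """Mean silhouette score with cosine distance (assumed stand-in criterion)."""
    scores = []
    for i, v in enumerate(vectors):
        dists = {}
        for j, u in enumerate(vectors):
            if j != i:
                dists.setdefault(labels[j], []).append(1.0 - dot(v, u))
        own = dists.pop(labels[i], [])
        if not own or not dists:  # singleton cluster or only one cluster
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def summarize(sentences):
    """Choose k by silhouette, then extract the sentence nearest each centroid."""
    vectors = unit_vectors(sentences)
    best = None
    for k in range(2, len(sentences)):
        labels, centroids = kmeans(vectors, k)
        score = silhouette(vectors, labels)
        if best is None or score > best[0]:
            best = (score, labels, centroids)
    if best is None:  # fewer than three sentences: nothing to cluster
        return list(sentences)
    _, labels, centroids = best
    picked = set()
    for c, cen in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            picked.add(max(members, key=lambda i: dot(vectors[i], cen)))
    return [sentences[i] for i in sorted(picked)]  # keep document order
```

Calling `summarize` on a list of sentence strings returns one extracted sentence per cluster, in document order. Because each cluster contributes a sentence, every detected topic is represented in the output, which is the property the abstract emphasizes. A real system would likely use stronger sentence representations and the paper's own cluster-quality criterion in place of the silhouette score.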
Similar resources
Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization
Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to decipher through such massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Results of CRL/NYU System at DUC-2003 and an Experiment on Division of Document Sets
We participated in three multi-document summarization tasks at the DUC-2003 formal run and evaluated the performance of our summarization system. Our summarization system based on sentence extraction also incorporated a module to estimate similarity between sentences for multi-document summarization. The similarity information was used for selecting the representative sentence among similar sen...
NTT/NAIST's Text Summarization Systems for TSC-2
In this paper, we describe the following two approaches to summarization: (1) only sentence extraction, (2) sentence extraction + bunsetsu elimination. For both approaches, we use the machine learning algorithm called Support Vector Machines. We participated in both Task-A (single-document summarization task) and Task-B (multi-document summarization task) of TSC-2.
Similarity-based Multilingual Multi-Document Summarization
We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated documents. A manual evaluation shows that 68%...
Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies
We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-documen...